Pattern Matching by means of Multi-Resolution Compression
Abstract
Introduction: The problem of compressed pattern matching deals with ways to find a pattern within a compressed file without decompressing it first. The techniques for solving this problem fall into two major categories: creating a dedicated compression scheme that enables efficient pattern matching, or taking a known compression scheme and developing algorithms to search the files it produces. In our work we have selected the first approach, trading off compression for fast pattern matching.
Multi-resolution coding: In a multi-resolution coding scheme, a file is coded in several layers, where the first layer reveals only partial information about the file, and each subsequent layer adds information until the file is completely decoded. To implement such coding for text files, we divide the alphabet into non-overlapping subgroups. The first encoded layer is a list of codes denoting the subgroup to which each symbol belongs. The second encoded layer resolves the specific symbol within each subgroup (we used only two levels of resolution, but the method can be adapted to more levels). The problem of optimal subgroup division has previously been encountered as an optimization problem of the CodePack compression scheme, and has previously been addressed by the authors.
Searching in multi-resolution encoded files: An important property of this coding method is that a string over the source alphabet is converted into a unique list of subgroups. In the other direction, of course, the uniqueness property does not hold: one list of subgroups may correspond to many source strings. Using this property, the following algorithm is used:
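The text available here stops before the algorithm itself is given. As an illustration only, the following Python sketch shows one search that is consistent with the property described above: the pattern is mapped to its subgroup sequence, the first layer is scanned for that sequence, and each candidate is then confirmed against the second layer. The subgroup partition `SUBGROUPS`, the function names, and the use of plain integer lists for the two layers are all assumptions made for this sketch; they are not taken from the source, and no actual compression of the layers is attempted.

```python
from typing import Dict, List, Tuple

# Hypothetical partition of the alphabet into non-overlapping subgroups
# (lowercase letters plus space). The source does not specify a partition.
SUBGROUPS: List[str] = ["aeiou", "bcdfg", "hjklm", "npqrs", "tvwxyz "]


def build_table(subgroups: List[str]) -> Dict[str, Tuple[int, int]]:
    """Map each symbol to (subgroup index, index within the subgroup)."""
    table: Dict[str, Tuple[int, int]] = {}
    for group_id, members in enumerate(subgroups):
        for offset, symbol in enumerate(members):
            table[symbol] = (group_id, offset)
    return table


TABLE = build_table(SUBGROUPS)


def encode(text: str) -> Tuple[List[int], List[int]]:
    """Split text into a coarse layer (subgroups) and a fine layer (offsets)."""
    layer1 = [TABLE[ch][0] for ch in text]
    layer2 = [TABLE[ch][1] for ch in text]
    return layer1, layer2


def decode(layer1: List[int], layer2: List[int]) -> str:
    """Recombine both layers into the original text."""
    return "".join(SUBGROUPS[g][i] for g, i in zip(layer1, layer2))


def coarse_matches(pattern: str, layer1: List[int]) -> List[int]:
    """Positions where the pattern's subgroup sequence occurs in layer 1."""
    p1, _ = encode(pattern)
    m = len(p1)
    if m == 0:
        return []
    return [s for s in range(len(layer1) - m + 1) if layer1[s:s + m] == p1]


def search(pattern: str, layer1: List[int], layer2: List[int]) -> List[int]:
    """Confirm each coarse candidate against layer 2 to get true matches."""
    _, p2 = encode(pattern)
    m = len(p2)
    return [s for s in coarse_matches(pattern, layer1)
            if layer2[s:s + m] == p2]
```

In this sketch the coarse scan can report false candidates, because different strings share a subgroup sequence, but it can never miss a true occurrence, because the subgroup sequence of a given string is unique. The second layer is consulted only at candidate positions.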
Similar resources
Multi-frame Super Resolution for Improving Vehicle Licence Plate Recognition
License plate recognition (LPR) by digital image processing, which is widely used in traffic monitoring and control, is one of the most important goals in Intelligent Transportation Systems (ITS). In real ITS, the resolution of input images is not very high, owing to technological challenges and the cost of high-resolution cameras. However, when the license plate image is taken at low resolution, the license...
Local Derivative Pattern with Smart Thresholding: Local Composition Derivative Pattern for Palmprint Matching
Palmprint recognition is a new biometrics system based on physiological characteristics of the palmprint, which includes rich, stable, and unique features such as lines, points, and texture. Texture is one of the most important features extracted from low resolution images. In this paper, a new local descriptor, Local Composition Derivative Pattern (LCDP) is proposed to extract smartly stronger...
Bitmap reconstruction for document image compression
We introduce a pattern matching algorithm and a bitmap reconstruction method used in document image compression. This pattern matching algorithm uses the cross entropy between two patterns as the criterion for a match. We use a physical model which is based on the finite resolution of the scanner (spatial sampling error) to estimate the probability values used in cross entropy calculation. The ma...
Entropy-based pattern matching for document image compression
In this paper, we introduce a pattern matching algorithm used in document image compression. This pattern matching algorithm uses the cross entropy between two patterns as the criterion for a match. We use a physical model which is based on the finite resolution of the scanner (spatial sampling error) to estimate the probability values used in cross entropy calculation. Experimental results show ...
Trie Compression for GPU Accelerated Multi-Pattern Matching
Graphics Processing Units (GPU) allow for running massively parallel applications offloading the Central Processing Unit (CPU) from computationally intensive resources. However GPUs have a limited amount of memory. In this paper, a trie compression algorithm for massively parallel pattern matching is presented demonstrating 85% less space requirements than the original highly efficient parallel...